Mochi: Visual Log-Analysis Based Tools for Debugging Hadoop (CMU-PDL-09-103)

Authors

Jiaqi Tan

Xinghao Pan

Soila Kavulya

Rajeev Gandhi

Priya Narasimhan

Abstract

Mochi, a new visual, log-analysis based debugging tool, correlates Hadoop’s behavior in space, time and volume, and extracts a causal, unified control- and data-flow model of Hadoop across the nodes of a cluster. Mochi’s analysis produces visualizations of Hadoop’s behavior that users can use to reason about and debug performance issues. We provide examples of Mochi’s value in revealing a Hadoop job’s structure, in optimizing real-world workloads, and in identifying anomalous Hadoop behavior, on the Yahoo! M45 Hadoop cluster.

Acknowledgements: The authors would like to acknowledge Christos Faloutsos and U Kang for discussions on the HADI Hadoop workload and for providing log data. This research was partially funded by the Defence Science & Technology Agency, Singapore, via the DSTA Overseas Scholarship, and sponsored in part by the National Science Foundation, via CAREER grant CCR-0238381 and grant CNS-0326453.

Similar resources

DiskReduce: RAID for Data-Intensive Scalable Computing (CMU-PDL-09-112)

Data-intensive file systems, developed for Internet services and popular in cloud computing, provide high reliability and availability by replicating data, typically three copies of everything. Alternatively high performance computing, which has comparable scale, and smaller scale enterprise storage systems get similar tolerance for multiple failures from lower overhead erasure encoding, or RAI...

ASDF: Automated, Online Fingerpointing for Hadoop (CMU-PDL-08-104)

Localizing performance problems (or fingerpointing) is essential for distributed systems such as Hadoop that support long-running, parallelized, data-intensive computations over a large cluster of nodes. Manual fingerpointing does not scale in such environments because of the number of nodes and the number of performance metrics to be analyzed on each node. ASDF is an automated, online fingerpo...

RAMS and BlackSheep: Inferring White-box Application Behavior Using Black-box Techniques (CMU-PDL-08-103)

A significant challenge in developing automated problem-diagnosis tools for distributed systems is the ability of these tools to differentiate between changes in system behavior due to workload changes from those due to faults. To address this challenge, current, typically white-box, techniques extract semantically-rich knowledge about the target application through fairly invasive, high-overhe...

Publication date: 2015
